Abstract



We present a quantitative analysis of throwing ability for major league outfielders
and catchers. We use detailed game event data to tabulate success and failure events
in outfielder and catcher throwing opportunities. We attribute a run contribution to
each success or failure, and these contributions are tabulated for each player in each
season. We use four seasons of data to estimate the overall throwing ability of each
player using a Bayesian hierarchical model. This model allows us to shrink individual
player estimates towards an overall population mean depending on the number of
opportunities for each player. We use the posterior distribution of player abilities from
this model to identify players with significant positive and negative throwing
contributions. Our results for both outfielders and catchers are publicly available.
1 Introduction



The impact of a fielder's arm strength on his defensive rating has long been
neglected and unmeasured. Research into outfielder ability has tended to focus on
estimation of differences in fielding range, such as the Ultimate Zone Rating (Lichtman,
2003) and the recent work by David Pinto (Pinto, 2006). Despite this recent advancement
in methods for quantifying the range of outfielders, there has been less development of
sophisticated methods of quantifying an outfielder's throwing ability as a defensive tool.
The put-outs and assists statistics are a common but unacceptable summary of outfielder
throwing ability since they only quantify successful events, and outfielders are rarely
charged with errors for failing to throw out runners, so there is no balancing measure of
unsuccessful events. In addition, there is a more subtle effect of throwing ability that is
not captured by current measures: an outfielder with a reputation for a strong arm will be
tested far less often and, as a result, will save runs over the course of the season by
reducing baserunner attempts to take extra bases. More recent work has begun to address
this need by quantifying both hold and kill events (Walsh, 2007) but does not consider the
influence of outfield ball-in-play location on these events.

Research into catcher fielding also shows a lack of sophisticated analyses of throwing
ability, which is even more necessary since there is limited information available for other
aspects of catcher fielding. Fielding range on pop-ups, bunts and short groundballs has
been examined (Pinto, 2006), but these are relatively rare events, and in the case of
pop-ups, the vast majority have large enough hang times that every catcher makes a
successful play. More success has been achieved in the study of passed balls and wild
pitches, due to the work by David Gassko (Gassko, 2005) and others. Studies by Keith
Woolner (Woolner, 1999) have attempted to quantify differences between catchers in terms
of pitch calling, but these effects have not been shown to be statistically significant.
Previous studies into throwing ability for catchers have not been satisfactory. Most
studies of throwing ability (Tippett, 1997) have used broad categorizations such as caught
stealing percentage, which does capture both successful and unsuccessful attempts to throw
out baserunners. However, the more subtle issue of attempt prevention is not captured by
this statistic: a catcher with a reputation for high throwing ability will have fewer
attempts to throw baserunners out, since baserunners will be less likely to attempt a
stolen base. The prevention of baserunning also has positive value to a team (though not
as much as throwing out a baserunner), but this effect is not captured by statistics based
only on baserunning attempts.

In this paper, we quantify the relative throwing ability of individual catchers and
outfielders by tracking all throwing opportunities in the 2002-2005 seasons and evaluating
the success of each fielder relative to the league average on the scale of
runs saved/cost. We capture both individual ability to throw out baserunners as well as
individual tendencies to prevent baserunning. We describe our overall methodology for
catchers and outfielders in Section 2. We focus on catchers and outfielders since the
throws made by infielders are so short that differences in throwing ability are impossible
to separate from fielding ability. In Section 3, we average individual player throwing
ability across multiple seasons with a simple hierarchical model (and Gibbs sampling
implementation) that allows for different variances and sample sizes between different
players. We examine several interesting results from this approach for catchers in
Section 4 and outfielders in Section 5. We conclude with a brief discussion of our analysis.



2 Evaluation of Throwing Ability 

To track throwing events, we used the Baseball Info Solutions database (BIS, 2007) which
tabulates detailed play-by-play data for all games within the 2002-2005 MLB seasons. The
BIS data contains all hitting and base-running events in each game, and any changes in
scoring and baserunner configurations as a consequence of each event. In addition,
fielding information is provided for balls put into play: fielders involved in the play
as well as the location where the ball was fielded. Our evaluation is based on an initial
categorization of each event as a baserunning opportunity or not depending on the game
situation. In order to provide an interpretable measure for comparing
individual players, we convert the successes or failures in these opportunities into a runs
saved/cost measure. The scale of runs saved/cost is a natural one and has been used
previously by Gassko (Gassko, 2005) in his work on the effects of passed balls and wild
pitches. Our overall strategy will be to apply the Expected Runs Matrix (REM, 2007) to
calculate the run contribution of successful vs. unsuccessful plays, and reward/punish
individual players accordingly. These run rewards for individual players will be
calculated while taking into account the averages across all players in order to come up
with a runs saved/cost for each player.



2.1 Evaluation for Catchers 



Consider all events in the 2002-2005 seasons that involved a base-stealing opportunity.
For catchers, a successful play could either be catching a baserunner during a stolen base
attempt, or preventing a baserunner from attempting a stolen base. Steal opportunities
consisted of five categories: runner on first base, runner on second base, runners on first
and third base, runner on second base with a runner on first, and runner on first base
with a runner on second (a distinction is made between the last two categories in order
to properly track double steals). These five categories can be further divided into fifteen
subcategories (C = 1, ..., 15) based on how many outs there were prior to the play in
question. Steals of home plate are not considered by our analyses since they are usually
attributable to the pitcher, not the catcher. For each catcher P, we tabulated all
base-stealing opportunities N(P, C) within each subcategory C as well as the number of
opportunities A(P, C) where the baserunner did attempt a stolen base. If a stolen base
was attempted, we tabulated the number of attempts that resulted in a successful steal
S(P, C) and the number of attempts where the baserunner was thrown out F(P, C). We
also totaled these counts across all catchers in order to establish total values for the entire
set of catchers. An example of our tabulation is given in Table 1 below.

Table 1: Tabulation of Base-Stealing Opportunities for 2002

Catcher    Situation C          Opportunities O_C   Attempts A_C   Stolen Bases S_C   Caught Stealings F_C
J. Lopez   man on 1st, 0 outs   241                 15             14                 11
All        man on 1st, 0 outs   12361               831            519                312

For each game situation C, we want to focus on two important concepts: the situation's
propensity for steal attempts and the success rate of those steal attempts. We use the
totals over all catchers together with the opportunities for catcher P to calculate
expected counts of stolen bases and caught stealings for catcher P:



\[
E[S(P,C)] = N(P,C) \times \frac{\sum_{P} S(P,C)}{\sum_{P} N(P,C)}
\qquad
E[F(P,C)] = N(P,C) \times \frac{\sum_{P} F(P,C)}{\sum_{P} N(P,C)}
\]

Returning to the situation in Table 1, we expected J. Lopez to have E[S(J. Lopez, C = 1)] =
10.08 stolen bases and E[F(J. Lopez, C = 1)] = 6.07 caught stealings. Comparing to his
observed totals, we see that J. Lopez had 3.92 more stolen bases and 4.93 more caught
stealings than expected in 2002. How should J. Lopez be compensated for these individual
differences from expectation for both stolen bases and caught stealings?
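
As an illustration of the expected-count calculation, the following minimal Python sketch scales a catcher's opportunities by the league-wide rates. The function name `expected_counts` and the counts passed to it are hypothetical and chosen only to keep the arithmetic easy to follow; they are not taken from the BIS data.

```python
def expected_counts(n_player, league_opps, league_sb, league_cs):
    """Expected stolen bases and caught stealings for one catcher in one situation.

    n_player    -- opportunities N(P, C) faced by the catcher
    league_opps -- opportunities summed over all catchers
    league_sb   -- stolen bases summed over all catchers
    league_cs   -- caught stealings summed over all catchers
    """
    if league_opps <= 0:
        raise ValueError("league_opps must be positive")
    exp_sb = n_player * league_sb / league_opps
    exp_cs = n_player * league_cs / league_opps
    return exp_sb, exp_cs


# Hypothetical counts for a single game situation.
exp_sb, exp_cs = expected_counts(
    n_player=200, league_opps=10000, league_sb=500, league_cs=300
)
```

Because the expectation is based on opportunities rather than attempts, a catcher who is rarely run on will show fewer stolen bases than expected, which is how attempt prevention enters the evaluation.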



The Expected Runs Matrix (REM, 2007) provides us with expected runs for each game
situation (baserunner configuration x number of outs), which we can use to calculate a
runs saved/cost value for a particular successful or unsuccessful play. As an example,
consider again the game situation from Table 1. From the Expected Runs Matrix, the
expected run value R for a (1st base alone, 0 outs) situation is 0.90. If the baserunner
attempts to steal and is thrown out by the catcher, then the game situation changes to (no
baserunners, 1 out) with a corresponding expected run value R of 0.28, which means
that the catcher has, in expectation, saved his team 0.62 runs. However, if the baserunner
successfully steals the base, then the game situation changes to (2nd base alone, 0 outs)
with a corresponding expected run value R of 1.14, meaning that the catcher has saved
his team -0.24 expected runs (i.e. cost his team 0.24 runs).
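
The arithmetic of this example can be sketched in a few lines of Python. The three run values are the ones quoted above; the dictionary layout, the state labels and the helper `runs_saved` are assumptions made for illustration, not part of the Expected Runs Matrix itself.

```python
# Expected-run values quoted in the text; keys are (bases occupied, outs).
RUN_EXPECTANCY = {
    ("1st", 0): 0.90,
    ("empty", 1): 0.28,
    ("2nd", 0): 1.14,
}


def runs_saved(before, after, table=RUN_EXPECTANCY):
    """Runs saved by the defense when the game state moves from `before` to `after`."""
    return table[before] - table[after]


caught = runs_saved(("1st", 0), ("empty", 1))  # runner thrown out
stolen = runs_saved(("1st", 0), ("2nd", 0))    # runner steals second
```

Here `caught` is approximately 0.62 and `stolen` is approximately -0.24, matching the values in the text.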

More generally, for any game situation C involving a baserunner, we have the run value
R(C) for that situation. If a stolen base is attempted and a runner is caught stealing,
the game situation changes from C → C', and the positive run value for the catcher
is R(C) − R(C'). If a stolen base is attempted and the runner gets a stolen base, the
game situation changes from C → C'', and the negative run value for the catcher is
R(C) − R(C''). Thus, the total number of runs saved by a catcher P for a particular game
situation C is:

\[
CV(P,C) = \{F(P,C) - E[F(P,C)]\} \times \{R(C) - R(C')\}
        + \{S(P,C) - E[S(P,C)]\} \times \{R(C) - R(C'')\} \qquad (1)
\]

where C' is the situation that results from a caught stealing event in situation C, and C''
is the situation that results from a stolen base event. Revisiting the example in Table 1,
we see that in 2002, CV(J. Lopez, C = 1) = 4.93 × 0.62 + 3.92 × (−0.24) = 2.12 total runs
saved in that game situation. Equation (1) must be evaluated for all fifteen game
situations C, giving us a total runs saved/cost of

\[
CV(P) = \sum_{C} \Big( \{F(P,C) - E[F(P,C)]\} \times \{R(C) - R(C')\}
        + \{S(P,C) - E[S(P,C)]\} \times \{R(C) - R(C'')\} \Big) \qquad (2)
\]

which is evaluated for each player in each year.
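
A minimal sketch of this summation in Python follows. It reuses the J. Lopez figures from the worked example (observed and expected counts, and the runs saved values 0.62 and -0.24); the function `catcher_value` and the dictionary keys are hypothetical names introduced for illustration.

```python
def catcher_value(situations):
    """Sum runs saved over game situations for one catcher-season.

    Each situation holds observed and expected caught stealings (cs) and
    stolen bases (sb), plus the runs saved by each outcome in that situation.
    """
    total = 0.0
    for s in situations:
        total += (s["cs"] - s["exp_cs"]) * s["saved_cs"]
        total += (s["sb"] - s["exp_sb"]) * s["saved_sb"]
    return total


# Figures from the worked example (man on 1st, 0 outs, 2002).
lopez_c1 = {
    "cs": 11, "exp_cs": 6.07, "saved_cs": 0.62,
    "sb": 14, "exp_sb": 10.08, "saved_sb": -0.24,
}
value = catcher_value([lopez_c1])
```

With only this one situation supplied, `value` is about 2.12; a full season total would pass all fifteen situations in the list.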
2.2 Evaluation for Outfielders



The tabulation of outfielder throwing opportunities is somewhat more complicated than
for catchers. We want to examine all ball-in-play (BIP) events to an outfielder that also had
potential baserunning consequences. For example, if a BIP event is a hit into the outfield,
then it is a throwing opportunity only if there were baserunners on first and/or second
base. Hits with baserunners only on third base were not included since it is assumed that
any baserunner can score from third base on a hit. However, if the BIP event was an out
(but not the third out), then the event can still be a throwing opportunity if there were
baserunners on second and/or third base that could attempt to advance on the play. We
assume that baserunners will not advance from first base on an out unless there is
another throwing event involved on the same play. We categorize each outfielder throwing
opportunity into a set of categories C depending on the configuration of baserunners and
whether the BIP was a hit or an out (H = 1 for hit, H = 0 for out). In order to account
for distance of the outfielder throw, the outfield surface was divided into a grid of 12 feet
(X) by 10 feet (Y) zones Z, and each outfielder throwing opportunity was also categorized
into a particular zone.
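
The zone assignment can be sketched as a simple binning step. The 12 ft by 10 ft zone size comes from the text; the coordinate origin, the use of feet for the fielded-ball location, and the names `zone_of` and `opportunity_key` are assumptions, since the layout of the BIS location data is not described here.

```python
import math

ZONE_WIDTH_FT = 12.0  # X extent of a zone, from the text
ZONE_DEPTH_FT = 10.0  # Y extent of a zone, from the text


def zone_of(x_ft, y_ft):
    """Return the (column, row) index of the grid zone containing a location."""
    return (math.floor(x_ft / ZONE_WIDTH_FT), math.floor(y_ft / ZONE_DEPTH_FT))


def opportunity_key(x_ft, y_ft, runners, is_hit):
    """Build the (Z, C, H) category for one throwing opportunity.

    `runners` is any hashable label for the baserunner configuration;
    H is coded as 1 for a hit and 0 for an out.
    """
    return (zone_of(x_ft, y_ft), runners, 1 if is_hit else 0)


key = opportunity_key(25.0, 305.0, runners="1st", is_hit=True)
```

Keys of this form can then index the counts of opportunities and outcomes for each player and for the league as a whole.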

Within each zone Z, and for every combination of baserunner configuration C and hit
vs. out H, we can tabulate the number of throwing opportunities N(P, Z, C, H) for each
player P. We break these opportunities down into the number that resulted in thrown out
baserunners S(P, Z, C, H) and the number that resulted in runner advancements
F(P, Z, C, H). Similar to our procedure for catchers, we can compare the actual counts for
outfielder P to their expected counts based on their number of opportunities and the
league totals:



\[
E[S(P,Z,C,H)] = N(P,Z,C,H) \cdot \frac{\sum_{P} S(P,Z,C,H)}{\sum_{P} N(P,Z,C,H)}
\]

\[
E[F(P,Z,C,H)] = N(P,Z,C,H) \cdot \frac{\sum_{P} F(P,Z,C,H)}{\sum_{P} N(P,Z,C,H)}
\]



Similar to the catcher tabulation, we want to assign run values to the actions of an
outfielder in each throwing opportunity, which again involves the use of the Expected Runs
Matrix (REM, 2007). For each throwing opportunity, we have the starting configuration
of baserunners C, which has a certain run value R(C). If the throwing opportunity results
in a thrown out baserunner, then our configuration changes to C' with run value R(C'),
and the outfielder has contributed a positive run value of R(C) − R(C'). However, if the
throwing opportunity results in a runner advancement, then our configuration changes
to C'' with run value R(C''), and the outfielder has contributed a negative run value
of R(C) − R(C''). We can thus come up with the following total run contribution of an
outfielder P relative to the average:

\[
OV(P) = \sum_{Z,C,H} \{S(P,Z,C,H) - E[S(P,Z,C,H)]\} \times \{R(C) - R(C')\}
      + \sum_{Z,C,H} \{F(P,Z,C,H) - E[F(P,Z,C,H)]\} \times \{R(C) - R(C'')\} \qquad (3)
\]

This same evaluation was repeated for each player in each year. 
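
The outfielder calculation, combining both outcomes described above, can be sketched as follows. The function `outfielder_value`, the dictionary layout and all numbers in the example are hypothetical; a real evaluation would derive the run values for each category from the Expected Runs Matrix.

```python
def outfielder_value(player, league, run_values):
    """Total runs saved by one outfielder relative to the league average.

    player, league -- dicts mapping a (zone, runners, H) key to a tuple
                      (opportunities N, thrown-out runners S, advancements F)
    run_values     -- dict mapping the same key to a tuple
                      (runs saved by a thrown-out runner, runs saved by an advance)
    """
    total = 0.0
    for key, (n, s, f) in player.items():
        league_n, league_s, league_f = league[key]
        exp_s = n * league_s / league_n
        exp_f = n * league_f / league_n
        saved_out, saved_adv = run_values[key]
        total += (s - exp_s) * saved_out + (f - exp_f) * saved_adv
    return total


# Hypothetical single-category example.
k = ((2, 30), "1st", 1)
ov = outfielder_value(
    player={k: (10, 2, 5)},
    league={k: (100, 10, 40)},
    run_values={k: (0.6, -0.3)},
)
```

In this toy case the player records one more thrown-out runner and one more advancement than expected, for a net value of about 0.3 runs.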

3 Hierarchical Model for Multiple Seasons 

Our tabulation procedure in Section 2 gives us yearly throwing tabulations for 133
catchers and 500 outfielders. In addition to evaluating catchers and outfielders on a seasonal
basis, we also use these yearly totals to estimate the overall throwing ability of each
catcher and outfielder. We use a simple hierarchical normal model designed to share
information between players while allowing for differences in variances between players.
We provide a short introduction to this model and also refer the reader to more detailed
discussions in Gelman et al. (2003). Consider the general situation of grouped data, i.e. $Y_{ij}$
where $j = 1, \ldots, m_i$ indexes observations within group $i$ and $i = 1, \ldots, N$ indexes the
groups. We model our data as noisy observations centered at a group-specific mean $\mu_i$
and with a group-specific variance $\sigma_i^2$,

\[
Y_{ij} \sim \mathrm{Normal}(\mu_i, \sigma_i^2)
\]

These group-specific means $\mu_i$ are also modeled as coming from a normal distribution,

\[
\mu_i \sim \mathrm{Normal}(\mu_0, \tau^2)
\]

The main parameters of interest are the unobserved means $\mu_i$ for each group $i$, which we
will refer to collectively with the vector $\mu$. Inference for each $\mu_i$ is based on a balance
between the observed mean $\bar{Y}_i = \sum_j Y_{ij} / m_i$ for that group and the population mean $\mu_0$
across all groups. The details of that balance are determined by the within-player variance
and between-player variance parameters, $\sigma_i^2$ and $\tau^2$. Assuming that the parameters $\sigma_i^2$, $\tau^2$
and $\mu_0$ are known, then the best estimate of $\mu_i$ is:

\[
\hat{\mu}_i = \frac{\dfrac{m_i}{\sigma_i^2}\,\bar{Y}_i + \dfrac{1}{\tau^2}\,\mu_0}
                   {\dfrac{m_i}{\sigma_i^2} + \dfrac{1}{\tau^2}} \qquad (5)
\]

A key consequence of this model is that the estimate $\hat{\mu}_i$ for a particular group is a
compromise between the shared mean $\mu_0$ across groups and the group-specific mean of observed
data $\bar{Y}_i$. The number of observations $m_i$ and the amount of variance within the group $\sigma_i^2$
control how much the resulting estimate $\hat{\mu}_i$ is shrunk towards the population mean $\mu_0$.
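
To make the shrinkage concrete, the precision-weighted average for the known-variance case can be sketched in Python. The function name `shrinkage_estimate` and the numbers in the example are hypothetical.

```python
def shrinkage_estimate(ybar, m, sigma2, mu0, tau2):
    """Precision-weighted compromise between a group mean and the population mean.

    ybar   -- observed mean for the group
    m      -- number of observations in the group
    sigma2 -- within-group variance
    mu0    -- population mean
    tau2   -- between-group variance
    """
    w_data = m / sigma2    # precision of the group mean
    w_prior = 1.0 / tau2   # precision of the population distribution
    return (w_data * ybar + w_prior * mu0) / (w_data + w_prior)


# Same observed mean, different amounts of data (hypothetical values).
one_season = shrinkage_estimate(ybar=4.0, m=1, sigma2=4.0, mu0=0.0, tau2=4.0)
four_seasons = shrinkage_estimate(ybar=4.0, m=4, sigma2=4.0, mu0=0.0, tau2=4.0)
```

With one observation the estimate is pulled halfway to the population mean (2.0), while with four observations it stays much closer to the observed mean (3.2).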

In reality, the parameters $\sigma_i^2$, $\tau^2$ and $\mu_0$ are not known themselves. The Bayesian approach
to this problem assumes prior distributions for these additional parameters. Instead of
focusing on a single point estimate of these parameters (such as the maximum likelihood
estimate), we want to calculate the full posterior distribution of all unknown parameters
$\Theta = (\mu, \sigma^2, \tau^2, \mu_0)$ given all observed data $Y$. Bayes' rule is used to calculate this full
posterior distribution:

\[
p(\Theta \mid Y) = \frac{p(Y \mid \Theta)\, p(\Theta)}{p(Y)}
\]

The posterior distribution provides the entire range of reasonable values for our unknown 
parameters, but we need to summarize this distribution in a principled way. We will use 
two summaries of each parameter in this paper: 

a. the posterior mean: $\hat{\mu}_i = E(\mu_i \mid Y)$ and

b. the 95% posterior interval: $(A, B)$ such that $P(A < \mu_i < B) = 0.95$.
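
Given simulated draws of a parameter, both summaries can be approximated directly. The sketch below uses the 2.5% and 97.5% sample quantiles for the interval (an equal-tailed interval, which is one common choice and an assumption here); the function `summarize` and the stand-in draws are hypothetical.

```python
import random


def summarize(draws, level=0.95):
    """Return the sample mean and an equal-tailed interval from posterior draws."""
    ordered = sorted(draws)
    tail = (1.0 - level) / 2.0
    lo_idx = int(tail * (len(ordered) - 1))
    hi_idx = int((1.0 - tail) * (len(ordered) - 1))
    mean = sum(ordered) / len(ordered)
    return mean, (ordered[lo_idx], ordered[hi_idx])


# Stand-in for sampler output: draws from a Normal(2, 1) distribution.
rng = random.Random(0)
draws = [rng.gauss(2.0, 1.0) for _ in range(20000)]
post_mean, (lower, upper) = summarize(draws)
```

A player would be flagged as having a significant throwing contribution when the resulting interval excludes zero.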

Posterior intervals are a similar concept to confidence intervals except that under the
Bayesian approach, the parameter $\mu_i$ is considered to be a random variable, and the
posterior interval gives a range of values that contains it with the stated posterior
probability. In contrast, under the classical approach
$\mu_i$ would be considered a fixed (but unknown) constant, and the range of values in a
confidence interval refers to the coverage of this fixed constant across repeated samples.
Unfortunately, the posterior distribution $p(\Theta \mid Y)$ for this model is too complicated for
posterior means and posterior intervals to be calculated analytically, and so we instead
use a simulation-based approach called the Gibbs sampler to approximate the full
posterior distribution $p(\Theta \mid Y)$. The Gibbs sampler (Geman and Geman, 1984) samples values
from the full posterior distribution $p(\Theta \mid Y)$ by iteratively sampling one parameter at a
time from the conditional distribution of that parameter given the current values of all
other parameters. Specific details about the Gibbs sampler for our model are given in
Appendix A, and we again refer to Gelman et al. (2003) for a more involved discussion of
Gibbs sampling and other simulation-based techniques.

We now present the actual hierarchical model used for our analysis of catcher throwing
ability, and the same model is also used for outfielder throwing ability. In this application,
the "groups" described above represent individual players, and the observations within
each group are the seasonal arm values for a particular player. Our implemented model
has the additional complication that we must also account for differences in the number
of opportunities between player seasons. Let $X_{ij}$ be the catcher run value CV for catcher
$i$ in season $j$, and let $n_{ij}$ be the number of opportunities for catcher $i$ in season $j$. In order
to compare different catchers on the same scale of opportunities, we calculate the average
number of opportunities $\bar{n}$ in a season, and scale each run value by a factor of
$n^*_{ij} = n_{ij} / \bar{n}$:

\[
Y_{ij} = \frac{X_{ij}}{n^*_{ij}}
\]

We model these re-scaled season run values as noisy observations from an underlying
catcher-specific throwing talent $\mu_i$:

\[
Y_{ij} \sim \mathrm{Normal}(\mu_i, \sigma_i^2 / n^*_{ij}) \qquad (7)
\]
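
The re-scaling step can be sketched as follows, under one reading of the scaling described above: each season's run value is divided by its relative number of opportunities, so that every season is expressed on the scale of an average season. The function `scale_run_values` and the example numbers are hypothetical, and every season is assumed to have a non-zero number of opportunities.

```python
def scale_run_values(run_values, opportunities):
    """Put season run values on a common opportunity scale.

    run_values    -- run values X, one per player-season
    opportunities -- matching opportunity counts n (all assumed positive)
    Returns the scaled values Y and the relative opportunities n*.
    """
    n_bar = sum(opportunities) / len(opportunities)
    n_star = [n / n_bar for n in opportunities]
    scaled = [x / ns for x, ns in zip(run_values, n_star)]
    return scaled, n_star


# Two hypothetical player-seasons with very different playing time.
scaled, n_star = scale_run_values(run_values=[2.0, 1.0], opportunities=[300, 100])
```

The relative opportunities returned here are the same quantities that divide the variance in the model, so seasons with more opportunities are treated as more precise observations.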






In addition to allowing player-specific variances $\sigma_i^2$ in model (7), we are also using $n^*_{ij}$ to
account for the fact that our observations $Y_{ij}$ should be more precise in seasons $j$ with
greater numbers of opportunities $n_{ij}$. The second level of our model allows information
to be shared between catchers by assuming common distributions for the catcher-specific
means $\mu_i$ and variances $\sigma_i^2$:

\[
\mu_i \sim \mathrm{Normal}(\mu_0, \tau^2) \qquad \sigma_i^2 \sim \mathrm{Inv\text{-}}\chi^2_{\nu} \qquad (8)
\]

The parameters $\mu_0$ and $\tau^2$ capture the center and spread of the latent catcher-specific
throwing abilities $\mu_i$. As discussed before, we will use a Bayesian approach that treats our
catcher-specific latent throwing abilities $\mu_i$ as random variables, and our inferential goal
is the posterior distribution of each $\mu_i$. As mentioned in our introduction to the normal
hierarchical model, an important consequence of our model is that our resulting estimates
of $\mu_i$ will be a compromise between the shared mean $\mu_0$ and the catcher-specific mean $\bar{Y}_i$
of scaled run values. The amount of shrinkage towards the shared mean for a particular $\mu_i$
will be a function of the catcher-specific variance $\sigma_i^2$ and the number of opportunities
for that player.

To complete our model, we posit prior distributions $\mu_0 \sim \mathrm{N}(0, \beta)$ and $\tau^2 \sim \mathrm{Inv\text{-}}\chi^2_{\gamma}$.
Hyper-parameter $(\beta, \nu, \gamma)$ values are used that make these prior assumptions non-influential
on our inference, as discussed in Appendix A. In Section 4 below, we examine the results
from our model implementation for catchers. We also implemented this same model for
our outfielder run values, and the results are given in Section 5.



4 Results for Catchers 



We focus our inference on the marginal posterior distribution of each $\mu_i$, the latent
throwing ability for each catcher $i$. We calculated the posterior mean and 95% posterior interval
of $\mu_i$ for all 133 catchers in our dataset. However, these $\mu_i$'s were estimated from our
scaled run values $Y_{ij}$ that assumed the same number of opportunities for each catcher. In
order to take into account differences in playing time, we also converted back to the scale of
the original catcher-specific totals by multiplying each posterior mean and posterior
interval by the average number of opportunities for that catcher. We use $\mu_i^*$ to denote our
re-scaled throwing contributions for each catcher $i$, which we call the player's individual
run contribution. For comparison, we call the scaled posterior mean $\hat{\mu}_i$ the scaled run
contribution for each player $i$. Our posterior means and posterior intervals for the individual
run contributions of all 133 catchers are publicly available.* In Table 2 we give the five
best and five worst catchers in terms of the posterior mean of their individual run
contributions $\mu_i^*$. We also provide the 95% posterior intervals for each of these catchers, and
we observe that there is a large amount of variance. This is not unexpected, considering
that there are at most four seasons of observations for each catcher. Even some of the best
and worst catchers have posterior intervals that overlap with zero.

* http://whartonball.blogspot.com/2007/04/evaluating-catcher-throwing-ability.html

Table 2: Catchers in 2002-05 with best and worst individual run contributions $\mu_i^*$

Five Best Catchers                            Five Worst Catchers
Name               Mean   Interval            Name               Mean    Interval
Schneider, Brian   6.55   (3.60, 8.36)        Piazza, Mike       -4.98   (-7.79, -1.05)
Molina, Yadier     4.41   (1.89, 5.58)        Varitek, Jason     -3.20   (-4.74, -1.28)
Hall, Toby         4.23   (1.74, 6.58)        Martinez, Victor   -3.00   (-4.06, -1.55)
Ardoin, Danny      3.67   (-1.08, 6.25)       Zaun, Gregg        -2.78   (-4.99, 0.27)
Miller, Damian     3.00   (1.25, 4.61)        Fordyce, Brook     -2.68   (-4.94, 0.66)

In Figure 1, we plot
the 95% posterior interval for all 109 catchers as a function of the posterior mean. Only
13 of 133 catchers have 95% posterior intervals that do not contain zero (indicated by the 
red line). We also see that players with larger magnitudes of their run contributions also 
tend to have wider intervals for their individual run contributions. 



5 Results for Outfielders 



Similar to our catcher analyses, we focus our outfielder inference on the scaled run
contribution $\hat{\mu}_i$ and individual run contribution $\mu_i^*$ for each outfielder, which were calculated
using the same methodology presented in Section 3. Just as before, the scaled run
contributions $\hat{\mu}_i$ are scaled to the same number of opportunities for each outfielder, whereas the
individual run contribution $\mu_i^*$ is re-scaled by the average number of opportunities faced by
that particular outfielder $i$. Our posterior means and posterior intervals for the individual
run contributions of all outfielders are publicly available.*

Table 3: Outfielders in 2002-05 with best and worst individual run contributions $\mu_i^*$

Ten Best Outfielders                               Ten Worst Outfielders
Name                 Mean   Interval               Name                Mean     Interval
Edmonds, Jim         8.72   (4.17, 12.40)          Brown, Emil         -16.78   (-20.82, -7.31)
Jones, Jacque        8.66   (4.22, 12.78)          Pierre, Juan        -10.77   (-16.56, -4.06)
Taveras, Willy       8.07   (0.21, 12.51)          Lawton, Matt        -9.43    (-13.87, -4.29)
Johnson, Kelly       7.77   (1.98, 10.60)          Sanchez, Alex       -7.79    (-12.27, -2.21)
Sullivan, Cory       7.37   (2.24, 10.36)          Holliday, Matthew   -7.34    (-11.52, -1.81)
Chavez, Endy         6.52   (1.26, 10.80)          Crawford, Carl      -7.11    (-10.60, -3.08)
Guerrero, Vladimir   6.20   (-2.54, 13.66)         Dejesus, David      -7.00    (-10.31, -3.17)
Hidalgo, Richard     6.18   (-1.10, 11.92)         Williams, Bernie    -6.79    (-11.28, -1.52)
Hunter, Torii        5.97   (-0.21, 10.71)         White, Rondell      -6.56    (-9.56, -2.76)
Walker, Larry        5.85   (1.33, 9.32)           Magee, Wendell      -5.97    (-9.47, -0.93)



In Table 3 we give the ten best and ten worst outfielders in terms of the posterior mean
of their individual run contributions $\mu_i^*$, along with 95% posterior intervals. We again
observe a large amount of variance in the posterior intervals, and the magnitude of the run
contribution for the best/worst outfielders is substantially greater than the magnitude of
the run contribution for the best/worst catchers. In Figure 2 we plot the 95% posterior
interval for all 500 outfielders as a function of the posterior mean. Only 60 of 500
outfielders have 95% posterior intervals that do not contain zero (indicated by the red line). We
again see that outfielders with larger magnitudes (highly positive or negative) of their
run contributions also tend to have wider intervals for their individual run contributions.



6 Discussion 

In this paper, we have presented an evaluation of throwing ability for both major league
catchers and outfielders. For outfielders, we focus on hits or outs on balls in play to
the outfield when there are runners on base, whereas we focus our evaluation on
base-stealing situations for catchers. For each player, our methodology tabulates the outcomes
of their throwing successes and failures while taking into account the game situation for
each opportunity. As described in Section 2, we convert the performance of each player
into runs saved/cost by tabulating the change in expected runs as a consequence of each
of their throwing actions. The run contribution for each player is calculated relative to the
average player, so a perfectly average defender would have a run contribution of zero.
The magnitude of run contributions is substantially higher for outfielders compared to
catchers, which is a consequence of the greater number of throwing opportunities for
outfielders as well as the generally greater run consequence of those opportunities.

* http://whartonball.blogspot.com/2007/03/evaluating-fielder-throwing-ability.html

Figure 2: 95% Posterior intervals for each outfielder ordered by the posterior mean

We use a hierarchical Bayesian model to estimate each player's innate throwing ability 
while accounting for differences between players in terms of variability and number of 
opportunities. This model acts to shrink the run contribution of players with high vari- 
ability or low numbers of opportunities towards the common mean of all players. Based 
on this model, very few outfielders or catchers show significantly superior or inferior 
throwing performance, as defined by their 95% posterior interval excluding zero. The 
small number of statistically significant players is partly explained by the limitation of 
only having a maximum of four years of detailed game play data for each player. Ad- 
ditional seasons of data for these players would reduce their posterior variance, which 
would likely lead to additional players with statistically significant abilities (i.e. 95% pos- 
terior intervals excluding zero). This additional data would also permit the extension of 
our hierarchical model to allow the throwing ability of individual players to change over 
time. It would be difficult to model any time trends with our current four seasons of data, 
but this is a promising area of future research. 

The use of the expected run matrix to evaluate the run consequence of stolen bases leads
to an interesting consequence for our catcher evaluations. In terms of change in expected
runs, it is much more valuable for a catcher to throw out a baserunner that is attempting to
steal than it is to prevent a baserunner from attempting a stolen base. Some catchers that
have a reputation for throwing out baserunners will not have as great a run contribution
because baserunners will attempt to steal less often, which is not rewarded as highly
as throwing out a baserunner who does attempt a steal. An example is Ivan Rodriguez,
who is considered to be the best catcher in the game, as evidenced by his 12 Gold Gloves,
the most awarded to an individual catcher in the history of the award. However, Rodriguez
is ranked as only the eighth best catcher by our analyses, in part because he has one of
the lowest proportions of steal attempts against him (3.47% of baserunning opportunities)
among regular catchers. In Table 4 we compare Ivan Rodriguez to Brian Schneider, the
top MLB catcher by our analysis. We see that Brian Schneider's overall run contribution
is aided by the fact that he has a higher attempt percentage (4.54% of baserunning
opportunities) relative to Rodriguez. Clearly, the optimal situation for a catcher is to have
a high success rate on throwing out baserunners but without the reputation for doing so,
so that baserunners still attempt to steal at a substantial rate. An extreme (and not
recommended) implementation of this strategy would suggest that catchers could deliberately
fail on a throwing attempt in a relatively unimportant game situation in the hopes that
baserunners would then be more likely to attempt (and be thrown out) in a more
important situation.



Table 4: Comparison of Brian Schneider and Ivan Rodriguez

Name               Attempt %   Mean   Interval
Schneider, Brian   4.54%       6.55   (3.60, 8.36)
Rodriguez, Ivan    3.47%       2.37   (-1.18, 5.41)



Considering further the issue of situational importance, a potential extension of our model
would be the incorporation of some measure of leverage into the tabulation of throwing
events. One could argue that catchers or outfielders should be rewarded more for
making a successful throw or penalized more for failure in a key situation. It is not clear,
however, whether the innate throwing ability that we are attempting to capture with our
method should be dependent on the importance of the situation. The question of whether
high leverage situations lead to a measurable difference in the performance of individual
baseball players is a subject of ongoing speculation (e.g. Tango, 2004).
A Hierarchical Model Implementation

As outlined in Section 3, we have an observed number of opportunities $n_{ij}$ and run value
$X_{ij}$ for catcher $i$ in season $j$. Although the following model implementation is presented
for our catcher evaluations, we use the same methodology for our outfielder analyses. We
scale our run values $X_{ij}$ to be on the same scale of opportunities,

\[
Y_{ij} = \frac{X_{ij}}{n^*_{ij}}, \qquad n^*_{ij} = \frac{n_{ij}}{\bar{n}}
\]

and then model these scaled run values as

\[
Y_{ij} \sim \mathrm{Normal}(\mu_i, \sigma_i^2 / n^*_{ij}) \qquad (9)
\]

where the parameters of interest are the underlying catcher-specific throwing talents $\mu_i$.
We share information between catchers by assuming common distributions for the
catcher-specific means $\mu_i$ and variances $\sigma_i^2$:

\[
\mu_i \sim \mathrm{Normal}(\mu_0, \tau^2) \qquad \sigma_i^2 \sim \mathrm{Inv\text{-}}\chi^2_{\nu} \qquad (10)
\]
Finally, we have the following prior distributions for our common mean $\mu_0$ and variance
$\tau^2$:

\[
\mu_0 \sim \mathrm{Normal}(0, \beta) \qquad \tau^2 \sim \mathrm{Inv\text{-}}\chi^2_{\gamma} \qquad (11)
\]

The hyper-parameters $\nu$, $\beta$, and $\gamma$ are assumed to be fixed and known, which means that
we need to calculate the posterior distribution of our remaining unknown parameters
$\Theta = (\mu, \sigma^2, \mu_0, \tau^2)$:



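\[
p(\Theta \mid Y) \propto \prod_{i=1}^{N} \Big[ \prod_{j=1}^{m_i} p(Y_{ij} \mid \mu_i, \sigma_i^2)
  \cdot p(\mu_i \mid \mu_0, \tau^2) \cdot p(\sigma_i^2 \mid \nu) \Big]
  \cdot p(\mu_0 \mid \beta) \cdot p(\tau^2 \mid \gamma) \qquad (12)
\]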



where $N$ is the number of catchers and $m_i$ is the number of seasons with a non-zero
number of opportunities for catcher $i$. We will estimate this posterior distribution with a Gibbs
sampling strategy (Geman and Geman, 1984) which consists of iteratively sampling from
the following conditional distributions:

1. $p(\mu \mid \sigma^2, \mu_0, \tau^2, Y)$

2. $p(\sigma^2 \mid \mu, \mu_0, \tau^2, Y)$

3. $p(\mu_0 \mid \sigma^2, \mu, \tau^2, Y)$

4. $p(\tau^2 \mid \mu_0, \sigma^2, \mu, Y)$

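Step 1 can be done individually for each $\mu_i$. The conditional distribution of each $\mu_i$ given
the other parameters is

\[
\mu_i \sim \mathrm{Normal}\left(
  \frac{\dfrac{1}{\sigma_i^2}\sum_{j=1}^{m_i} n^*_{ij} Y_{ij} + \dfrac{\mu_0}{\tau^2}}
       {\dfrac{1}{\sigma_i^2}\sum_{j=1}^{m_i} n^*_{ij} + \dfrac{1}{\tau^2}},\;
  \frac{1}{\dfrac{1}{\sigma_i^2}\sum_{j=1}^{m_i} n^*_{ij} + \dfrac{1}{\tau^2}}
\right) \qquad (13)
\]

We see that each catcher-specific throwing talent $\mu_i$ is a weighted compromise between
the observed data $\sum_j n^*_{ij} Y_{ij}$ and the common mean $\mu_0$. The amount of shrinkage towards
this common mean $\mu_0$ for a particular catcher is based on their number of opportunities
and variance $\sigma_i^2$. Catchers with low variance $\sigma_i^2$ will not be pulled as much towards the
common mean $\mu_0$.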
Step 2 can be done individually for each $\sigma_i^2$. The conditional distribution of each $\sigma_i^2$ given
the other parameters is

\[
\sigma_i^2 \sim \mathrm{InverseGamma}\left(
  \frac{m_i + \nu}{2},\;
  \frac{1 + \sum_{j=1}^{m_i} n^*_{ij}\,(Y_{ij} - \mu_i)^2}{2}
\right) \qquad (14)
\]

From equation (14), we see that our prior distribution on $\sigma_i^2$ is made non-influential
relative to the data by letting $\nu \to 0$.

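For step 3, we use the following conditional distribution of $\mu_0$ given the other parameters,

\[
\mu_0 \sim \mathrm{Normal}\left(
  \frac{\dfrac{1}{\tau^2}\sum_{i=1}^{N} \mu_i}{\dfrac{N}{\tau^2} + \dfrac{1}{\beta}},\;
  \frac{1}{\dfrac{N}{\tau^2} + \dfrac{1}{\beta}}
\right) \qquad (15)
\]

From equation (15), we see that our prior distribution on $\mu_0$ is made non-influential
relative to the data by letting $\beta \to \infty$.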

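For step 4, we use the following conditional distribution of $\tau^2$ given the other parameters,

\[
\tau^2 \sim \mathrm{InverseGamma}\left(
  \frac{N + \gamma}{2},\;
  \frac{1 + \sum_{i=1}^{N} (\mu_i - \mu_0)^2}{2}
\right) \qquad (16)
\]

From equation (16), we see that our prior distribution on $\tau^2$ is made non-influential
relative to the data by letting $\gamma \to 0$.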
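
The four sampling steps can be sketched in Python using only the standard library. This is a minimal illustration under stated assumptions rather than the implementation used for our results: the function names `gibbs` and `inv_gamma`, the starting values, the iteration counts, the hyper-parameter values (small $\nu$ and $\gamma$, large $\beta$) and the toy data are all hypothetical, and the variance updates assume the standard Inv-$\chi^2$ density, so that the prior contributes a constant term to the conditional scale.

```python
import random


def inv_gamma(rng, shape, scale):
    """Draw from InverseGamma(shape, scale) as scale / Gamma(shape, 1)."""
    return scale / rng.gammavariate(shape, 1.0)


def gibbs(y, w, n_iter=2000, burn_in=500, nu=0.01, gamma=0.01, beta=1e6, seed=0):
    """Gibbs sampler for the hierarchical normal model.

    y[i][j] -- scaled run value for player i in season j
    w[i][j] -- relative opportunities n*_ij for the same season
    Returns one list of player means (mu) per retained iteration.
    """
    rng = random.Random(seed)
    n_players = len(y)
    mu = [sum(yi) / len(yi) for yi in y]  # start at the observed means
    sigma2 = [1.0] * n_players
    mu0, tau2 = 0.0, 1.0
    kept = []
    for it in range(n_iter):
        # Step 1: each mu_i given everything else (normal).
        for i in range(n_players):
            prec = sum(w[i]) / sigma2[i] + 1.0 / tau2
            weighted = sum(wij * yij for wij, yij in zip(w[i], y[i]))
            mean = (weighted / sigma2[i] + mu0 / tau2) / prec
            mu[i] = rng.gauss(mean, prec ** -0.5)
        # Step 2: each sigma_i^2 given everything else (inverse gamma).
        for i in range(n_players):
            ss = sum(wij * (yij - mu[i]) ** 2 for wij, yij in zip(w[i], y[i]))
            sigma2[i] = inv_gamma(rng, (len(y[i]) + nu) / 2.0, (1.0 + ss) / 2.0)
        # Step 3: mu_0 given everything else (normal).
        prec0 = n_players / tau2 + 1.0 / beta
        mu0 = rng.gauss((sum(mu) / tau2) / prec0, prec0 ** -0.5)
        # Step 4: tau^2 given everything else (inverse gamma).
        ss_mu = sum((m - mu0) ** 2 for m in mu)
        tau2 = inv_gamma(rng, (n_players + gamma) / 2.0, (1.0 + ss_mu) / 2.0)
        if it >= burn_in:
            kept.append(list(mu))
    return kept


# Toy data: three players with 3, 2 and 1 seasons (values are made up).
y = [[3.0, 2.5, 3.5], [-2.0, -1.5], [0.2]]
w = [[1.0, 1.2, 0.8], [0.9, 1.1], [0.3]]
samples = gibbs(y, w)
mean_first = sum(s[0] for s in samples) / len(samples)
mean_second = sum(s[1] for s in samples) / len(samples)
```

The retained draws for each player can then be summarized by their mean and 95% interval to give the posterior summaries reported in the main text.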